We establish general conditions for the asymptotic validity of sequential stopping rules
to achieve fixed-volume confidence sets for simulation estimators of vector-valued
parameters. The asymptotic validity occurs as the prescribed volume of the confidence set
approaches zero. There are two requirements: a functional central limit theorem for the
estimation process and strong consistency (with-probability-one convergence) for the
variance or “scaling matrix” estimator. Applications are given for: sample means of i.i.d.
random variables and random vectors, nonlinear functions of such sample means,
jackknifing, Kiefer-Wolfowitz and Robbins-Monro stochastic approximation, and both
regenerative and non-regenerative steady-state simulation.
1. INTRODUCTION

The  run-length  of  a  stochastic  simulation  is  typically  determined  by  one  of  two 
methods.  The  first  method  is  to  assign  the  run-length  prior  to  performing  the  simulation. 
This  is  usually  done  by  specifying  either  the  amount  of  simulation  time  to  be  generated  or 
the  amount  of  computer  time  to  be  expended.  The  principal  disadvantage  of  this  approach 
is  that  the  posterior  precision  of  the  estimator  may  not  be  appropriate.  Since  the  volume 
of the confidence set (the width of a confidence interval in one dimension) is unknown in
advance,  the  volume  may  be  too  large  to  be  of  practical  use  (meaning  that  the  pre-assigned 
run-length  was  too  small)  or  too  small  (meaning  that  computational  resources  were  wasted 
in  refining  the  estimator  beyond  the  level  of  accuracy  required) . 

The  second  method  is  a  sequential  stopping  procedure;  i.e.,  we  let  the  simulation  run 
until  the  volume  of  a  confidence  set  achieves  a  prescribed  value.  This  avoids  the  problems 
associated  with  pre-assigned  run-lengths,  but  new  difficulties  are  introduced  because  the 
run-length  is  now  randomly  determined.  The  first  difficulty  is  that  we  no  longer  have 
direct  control  of  the  amount  of  simulation  time  to  be  generated  or  the  amount  of  computer 
time  to  be  expended.  Consequently,  the  run-length  may  turn  out  to  be  much  longer  than 
we  want.  On  the  other  hand,  it  is  possible  that  the  run-length  may  turn  out  to  be 
inappropriately  short.  This  creates  certain  statistical  difficulties  that  can  compromise  the 
accuracy  of  such  procedures.  For  example,  it  is  known  that  in  many  statistical  settings, 
the  point  estimator  and  the  variance  estimator  are  positively  correlated.  Since  the  volume 
of  a  confidence  set  is  typically  determined  by  the  variance  estimator,  this  suggests  that  the 
confidence  set  volume  will  tend  to  be  small  when  the  point  estimator  is  small. 
Consequently, the resulting sequential procedure will tend to terminate early in situations
in which the point estimator is too small, leading to possibly significant coverage problems
for the confidence sets. Nevertheless, sequential stopping rules are of interest because of
the possibility of automatically obtaining prescribed precision.

Various  sequential  stopping  rules  for  simulation  estimators  have  been  proposed  and 
investigated  empirically.  Among  these  are  sequential  procedures  involving:  batch  means 
in  Law  and  Carson  (1979)  and  Law  and  Kelton  (1982),  regenerative  simulation  in 
Fishman  (1977)  and  Lavenberg  and  Sauer  (1977),  and  spectral  methods  in  Heidelberger 
and  Welch  (1981a, b,  1983);  see  pp.  81,  92,  97,  103  of  Bratley,  Fox  and  Schrage  (1987)  for 
an  overview.  Unfortunately,  however,  the  empirical  evidence  is  not  entirely  encouraging. 
Evidently,  care  must  be  taken  in  the  design  and  implementation  of  sequential  procedures 
to  avoid  inappropriate  early  termination.  On  the  positive  side,  the  sequential  procedures 
do  tend  to  perform  well  when  the  run  lengths  are  relatively  long,  which  is  achieved  in  part 
by  having  a  suitably  small  prescribed  volume  for  the  confidence  set. 

The  observed  good  performance  with  small  prescribed  confidence  set  volumes  is 
consistent  with  the  classical  asymptotic  theory  of  sequential  procedures  for  obtaining 
fixed-width  confidence  intervals  for  the  mean  of  independent  and  identically  distributed 
(i.i.d.)  real-valued  random  variables;  see  Anscombe  (1952,  1953),  Chow  and  Robbins 
(1965), Starr (1966), Nadas (1969) and Chapter VII of Siegmund (1985). This asymptotic
theory  establishes  that  the  sequential  procedure  is  indeed  asymptotically  valid  as  the 
prescribed  width  of  the  confidence  interval  approaches  zero  (and  the  resulting  run  length 
approaches  infinity).  This  classical  asymptotic  theory  provides  a  theoretical  basis  for 
confidence  in  sequential  procedures,  but  it  is  not  directly  relevant  to  most  simulation 
estimators,  because  the  classical  asymptotic  theory  is  for  i.i.d.  random  variables.  The 
classical theory does apply relatively directly to regenerative simulations, as was shown by
Lavenberg  and  Sauer  (1977),  but  there  evidently  is  not  yet  any  asymptotic  theory  for  non- 
regenerative  steady-state  simulation  estimators. 


The  purpose  of  this  paper  is  to  fill  the  gap.  We  provide  general  conditions  for  the 
asymptotic  validity  of  sequential  stopping  rules  for  a  large  class  of  simulation  estimators. 
The main conditions are that the estimation process obey a functional central limit theorem
(FCLT)  and  that  there  be  a  strongly  consistent  estimator  for  the  asymptotic  variance  of  the 
estimator.  (We  also  treat  d-dimensional  parameters;  then  the  asymptotic  variance  should 
be  replaced  by  an  asymptotic  covariance  matrix  or,  equivalently,  by  an  associated 
“scaling”  matrix;  see  §2.)  Alternatively,  for  the  variance  estimator  it  suffices  for  it  to 
satisfy  a  functional  weak  law  of  large  numbers  (FWLLN),  which  is  often  obtained  as  a 
consequence of a FCLT. The strong consistency (w.p.1 convergence) or the FWLLN for
the variance estimator is important; we provide a counterexample in §4 showing that
asymptotic  validity  need  not  hold  with  only  weak  consistency  (ordinary  one-dimensional 
in-probability  convergence).  Indeed,  the  conditions  here  are  natural  for  the  random-time- 
change  limit  theorems  upon  which  the  proofs  depend;  e.g.,  see  Richter  (1965),  §17  of 
Billingsley  (1968),  §3,5  of  Whitt  (1980),  Glynn  and  Whitt  (1988)  and  Gut  (1988). 

The  rest  of  this  paper  is  organized  as  follows.  Section  2  provides  limit  theorems  which 
guarantee  that  the  coverage  of  a  sequential  procedure  converges  to  the  desired  level  when 
the  prescribed  volume  of  the  confidence  set  is  shrunk  to  zero.  Section  3  contains 
applications  of  the  limit  theorems  to  various  estimation  settings.  We  give  conditions  under 
which  sequential  stopping  rules  are  valid  for  a  variety  of  estimation  problems  not 
previously  considered:  estimation  of  a  nonlinear  function  of  means,  non-regenerative 
steady-state  simulation,  and  jackknife  estimators.  Section  4  contains  the  counterexample 
when the variance estimator is only weakly consistent. Finally, Section 5 contains all
proofs. 
2. THE FRAMEWORK AND MAIN LIMIT THEOREMS

Our goal is to estimate a parameter $\alpha \in \mathbb{R}^d$. We assume that there exists an $\mathbb{R}^d$-valued
stochastic process $Y = \{Y(t) : t \ge 0\}$, called the estimation process, for which $Y(t) \Rightarrow \alpha$ as
$t \to \infty$, where $\Rightarrow$ denotes convergence in distribution (which here coincides with
convergence in probability because $\alpha$ is deterministic). Actually, we shall need to require
that  the  estimation  process  satisfy  a  stronger  hypothesis,  in  particular,  a  functional  central 
limit  theorem  (FCLT);  in  most  applications,  it  will  effectively  amount  to  assuming  that  the 
estimation  process  satisfies  an  ordinary  central  limit  theorem  (CLT). 

Let $D(0, \infty)$ be the space of right-continuous $\mathbb{R}^d$-valued functions with left limits on the
open interval $(0, \infty)$, endowed with the standard $J_1$ topology; see Ethier and Kurtz (1986)
or Whitt (1980). We work with $D(0, \infty)$ rather than $D[0, \infty)$ in order to avoid having to
deal with possible singularities in $Y$ at the origin $t = 0$; e.g., in estimators such as
$Y(t) = t^{-1} \int_0^t Z(s)\,ds$. At continuous limits, convergence in $D(0, \infty)$ is equivalent to
uniform convergence over compact subintervals of $(0, \infty)$. We assume that $Y$ has sample
paths in $D(0, \infty)$ and consider the family of scaled processes

$\mathcal{Y}_\epsilon(t) = \epsilon^{-\gamma} (Y(t/\epsilon) - \alpha)$, $t > 0$,

in $D(0, \infty)$ for $\epsilon > 0$. We assume that:

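(2.1) There exists a nonsingular $d \times d$ matrix $\Gamma$, a constant $\gamma > 0$, and an $\mathbb{R}^d$-valued
process $\mathcal{Y} = \{\mathcal{Y}(t) : t > 0\}$ that is continuous at $t$ w.p.1 for all $t$ such that
$\mathcal{Y}_\epsilon \Rightarrow \Gamma \mathcal{Y}$ in $D(0, \infty)$ as $\epsilon \downarrow 0$, which we denote by

$\mathcal{Y}_\epsilon(t) \equiv \epsilon^{-\gamma} (Y(t/\epsilon) - \alpha) \Rightarrow \Gamma \mathcal{Y}(t)$ in $D(0, \infty)$ as $\epsilon \downarrow 0$.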

Typically the rate of convergence in (2.1) is $\gamma = 1/2$ and the limit process takes the
form $\mathcal{Y}(t) = B(t)/t$, where $B$ is standard Brownian motion (which is composed of $d$
independent 1-dimensional Brownian motions, each having zero drift and unit diffusion
coefficient), but these are not required. Note that assumption (2.1) guarantees that

$Y(t) - \alpha = t^{-\gamma} \mathcal{Y}_{1/t}(1) \Rightarrow 0 \cdot \Gamma \mathcal{Y}(1) = 0$ as $t \to \infty$;

i.e., $Y(t) \Rightarrow \alpha$ as $t \to \infty$, so that $Y(t)$ is a weakly consistent estimator of $\alpha$.

First, suppose that we pre-assign the amount of simulation time $t$ to be generated by
the computer. To obtain an approximate $100(1 - \delta)\%$ confidence set for $\alpha$, we assume
that there exists a bounded set $A$ for which

(2.2) $P\{\mathcal{Y}(1) \in A\} = 1 - \delta$ and $P\{\mathcal{Y}(1) \in \partial A\} = 0$,

where $\partial A$ is the boundary of $A$. Then, let $C(t) = Y(t) - t^{-\gamma} \Gamma A$, where we use the
notation $z + QA$ to denote the set $\{x \in \mathbb{R}^d : \exists\, y \in A \text{ such that } x = z + Qy\}$ for $z \in \mathbb{R}^d$ and
$d \times d$ matrices $Q$. The following proposition shows that $C(t)$ achieves the nominal
coverage level as the sample size $t$ is permitted to go to infinity.

PROPOSITION 1. Under (2.1) and (2.2), $P\{\alpha \in C(t)\} \to 1 - \delta$ as $t \to \infty$.

Of course, in applications, $\Gamma$ is typically unknown so that it must be estimated.
Assume that there exists an estimator $\Gamma(t)$ which is weakly consistent, i.e., $\Gamma(t) \Rightarrow \Gamma$ as
$t \to \infty$. Let $C(t) = Y(t) - t^{-\gamma} \Gamma(t) A$.

PROPOSITION 2. If $\Gamma(t) \Rightarrow \Gamma$ as $t \to \infty$ under (2.1) and (2.2), then $P\{\alpha \in C(t)\} \to 1 - \delta$
as $t \to \infty$.

Thus, the confidence set $C(t)$ yields a procedure which is both implementable and
asymptotically valid.

Remark (2.1). Propositions 1 and 2 actually require only the CLT version of (2.1), i.e.,
$\mathcal{Y}_\epsilon(1) \Rightarrow \Gamma \mathcal{Y}(1)$ in $\mathbb{R}^d$ as $\epsilon \to 0$; see the proofs in §5. ■

We turn now to a discussion of sequential stopping procedures. For a generic
(measurable) set $B \subseteq \mathbb{R}^d$, let $m(B)$ denote the $d$-dimensional volume (Lebesgue measure)
of the set. Of course, when $d = 1$ and $B$ is an interval, $m(B)$ is just the length of the
interval. We first consider the case in which the procedure terminates when the $d$th root
of the volume of the confidence region $C(t)$ drops below a prescribed level $\epsilon$. (It is
natural to use the $d$th root, because $m(xB)^{1/d} = x\, m(B)^{1/d}$ for a scalar $x > 0$.) We call such a
procedure an absolute-precision sequential stopping rule. For such a rule, the time $T(\epsilon)$ at
which the simulation terminates execution is defined by

$T(\epsilon) = \inf\{t \ge 0 : m(C(t))^{1/d} \le \epsilon\}$.

Actually, this stopping rule needs to be modified, because $T(\epsilon)$ can terminate much too
early if the estimator $\Gamma(t)$ is badly behaved for small $t$. To see this, suppose that
$P(\Gamma(1) = 0,\ m(C(t)) = 1,\ 0 \le t < 1) = 1$. In this case, $T(\epsilon) = 1$ for $\epsilon < 1$, so
$C(T(\epsilon)) = \{Y(1)\}$ for $\epsilon < 1$. Hence, in this example, $P(\alpha \in C(T(\epsilon))) = P(\alpha = Y(1))$ for
$\epsilon < 1$, so convergence of the coverage probability of the region $C(T(\epsilon))$ to the
nominal level $1 - \delta$ does not occur when we let $\epsilon \downarrow 0$.

In order for the asymptotic theory associated with (2.1) to be relevant to the sequential
stopping problem, it is necessary that $T(\epsilon) \to \infty$ as $\epsilon \downarrow 0$. In other words, small values of
the precision constant $\epsilon$ need to correspond to large values of simulation time. We can
force the termination time to behave in this way if we inflate the volume $m(C(t))$ slightly.
Let $a(t)$ be a strictly positive function that decreases monotonically to zero as $t \to \infty$ and
satisfies $a(t) = o(t^{-\gamma})$. Set

(2.3) $T_1(\epsilon) = \inf\{t \ge 0 : m(C(t))^{1/d} + a(t) \le \epsilon\}$

and

$t_1(\epsilon) = \inf\{t \ge 0 : a(t) \le \epsilon\}$.

Note that $T_1(\epsilon) \ge t_1(\epsilon) \to \infty$ as $\epsilon \downarrow 0$. Thus, the early termination associated with $T(\epsilon)$ is
prevented by the stopping rule $T_1(\epsilon)$. In practice, one might use a priori analytical
estimates of required simulation run lengths, as in Whitt (1989a,b), to determine the
function $a(t)$; e.g., $a(t) = 1$ for $t \le T_0$ and $a(t) =$ for $t \ge T_0$; but we do not

examine  specific  procedures  here. 

Throughout  the  rest  of  this  paper,  we  assume  that  m(A)  >  0  for  A  in  (2.2).  The  next 
two  theorems  provide  our  main  asymptotic  results  about  sequential  stopping  rules  for 
stochastic  simulations.  The  first  theorem  shows  that  under  the  relatively  mild  assumption 
(2.1), strong consistency of the estimator $\Gamma(t)$ suffices to guarantee that the
absolute-precision stopping rule $T_1(\epsilon)$ has a variety of desirable asymptotic properties, including
asymptotic  validity. 

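THEOREM 1. Suppose that (2.1) and (2.2) hold. If $\Gamma(t) \to \Gamma$ w.p.1 as $t \to \infty$, then as $t \to \infty$
or $\epsilon \to 0$

(i) $t^{\gamma} [m(C(t))^{1/d} + a(t)] \to m(\Gamma A)^{1/d}$ w.p.1,

(ii) $\epsilon^{1/\gamma} T_1(\epsilon) \to m(\Gamma A)^{1/(d\gamma)}$ w.p.1,

(iii) $\epsilon^{-1} m(C(T_1(\epsilon)))^{1/d} \to 1$ w.p.1,

(iv) $\epsilon^{-1} [Y(T_1(\epsilon)) - \alpha] \Rightarrow m(\Gamma A)^{-1/d}\, \Gamma \mathcal{Y}(1)$ in $\mathbb{R}^d$,

(v) $P(\alpha \in C(T_1(\epsilon))) \to 1 - \delta$ (asymptotic validity).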

Part (i) shows that w.p.1 both the $d$th root of the volume and the inflated $d$th root of
the volume of the confidence set at time $t$ are asymptotically equal to $t^{-\gamma} m(\Gamma A)^{1/d}$ to first
order as $t \to \infty$. Part (ii) shows that w.p.1 the termination time $T_1(\epsilon)$ is asymptotically
equal to $d(\epsilon) \equiv \epsilon^{-1/\gamma} m(\Gamma A)^{1/(d\gamma)}$ to first order as $\epsilon \to 0$. Note that $d(\epsilon)$ is precisely that
deterministic time at which $m(C(d(\epsilon)))^{1/d} = \epsilon$ when $C(t)$ is constructed with the true $\Gamma$.
Of course, the time $d(\epsilon)$ is not directly
implementable because $\Gamma$ is presumed unknown. Part (iii) shows that w.p.1 the $d$th root
of the volume of the final realized confidence set, $m(C(T_1(\epsilon)))^{1/d}$, is asymptotically equal to
the prescribed value $\epsilon$ to first order as $\epsilon \to 0$. Part (iv) is a random-time-change CLT that
serves to establish the asymptotic validity in (v). Thus, the sequential stopping rule $T_1(\epsilon)$
provides the same asymptotic behavior as $d(\epsilon)$, despite the fact that $T_1(\epsilon)$ needs to
estimate $\Gamma$.
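
The closed form of $d(\epsilon)$ can be checked directly from the definition of $C(t)$ with $\Gamma$ known. The short derivation below is only a sketch; it uses nothing beyond the translation invariance and scaling property of Lebesgue measure. Its last line specializes to the scalar i.i.d. case ($d = 1$, $\gamma = 1/2$, $\Gamma = \sigma$, $A = [-z(\delta), z(\delta)]$), which is included purely as an illustration.

```latex
\begin{align*}
m(C(t)) &= m\bigl(Y(t) - t^{-\gamma}\Gamma A\bigr) = t^{-\gamma d}\, m(\Gamma A), \\
m(C(t))^{1/d} = \epsilon
  &\iff t^{-\gamma}\, m(\Gamma A)^{1/d} = \epsilon
   \iff t = \epsilon^{-1/\gamma}\, m(\Gamma A)^{1/(d\gamma)} = d(\epsilon), \\
d = 1,\ \gamma = \tfrac12,\ \Gamma = \sigma,\ A = [-z(\delta), z(\delta)]
  &\implies d(\epsilon) = \bigl(2\sigma z(\delta)/\epsilon\bigr)^{2}.
\end{align*}
```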

Remark  (2.2).  The  FCLT  (2.1)  instead  of  the  ordinary  CLT  is  needed  to  establish  the 
random-time-change CLT in part (iv) of Theorem 1; e.g., see Example 4 of Glynn and
Whitt  (1988).  ■ 

Our next result shows that we can replace the strong consistency of $\Gamma(t)$ in Theorem 1
with a functional weak law of large numbers (FWLLN). A FWLLN is easily obtained as
a corollary to a FCLT. Since an ordinary strong law of large numbers (SLLN) is
equivalent to a FSLLN (see Theorem 4 of Glynn and Whitt (1988) or Lemma 3 of Glynn
and Whitt (1989)), which in turn implies a FWLLN, we also obtain the following results
under the strong consistency assumption of Theorem 1. Moreover, the
convergence in (i)-(iii) can then be strengthened to convergence w.p.1. However, a FWLLN need not
imply a SLLN (see Example 2 of Glynn and Whitt (1988)), so that the condition of
Theorem 2 is actually more general. We obtain convergence statements with $t = 1$
paralleling those of Theorem 1 by simply applying the continuous mapping theorem with
the projection map at $t = 1$. In Section 4 we show that we cannot simply assume an
ordinary WLLN, i.e., that $\Gamma(t)$ is weakly consistent. Let $\stackrel{d}{=}$ denote equality in
distribution.
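THEOREM 2. Suppose that (2.1) and (2.2) hold. If $\Gamma(t/\epsilon) \Rightarrow \Gamma$ in $D(0, \infty)$ with range $\mathbb{R}^{d \times d}$
as $\epsilon \to 0$, then as $\epsilon \to 0$

(i) $\epsilon^{-\gamma} [m(C(t/\epsilon))^{1/d} + a(t/\epsilon)] \Rightarrow t^{-\gamma} m(\Gamma A)^{1/d}$ in $D(0, \infty)$,

(ii) $\epsilon^{1/\gamma} T_1(\epsilon/t) \Rightarrow t^{1/\gamma} m(\Gamma A)^{1/(d\gamma)}$ in $D(0, \infty)$,

(iii) $\epsilon^{-1} m(C(T_1(\epsilon/t)))^{1/d} \Rightarrow t^{-1}$ in $D(0, \infty)$,

(iv) $\epsilon^{-1} [Y(T_1(\epsilon/t)) - \alpha] \Rightarrow \Gamma \mathcal{Y}(t^{1/\gamma} m(\Gamma A)^{1/(d\gamma)}) \stackrel{d}{=} m(\Gamma A)^{-1/d}\, \Gamma \mathcal{Y}(t^{1/\gamma})$ in $D(0, \infty)$,

(v) $P(\alpha \in C(T_1(\epsilon))) \to 1 - \delta$ (asymptotic validity).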

Our next theorem considers a variant of the stopping rule $T_1(\epsilon)$ known as a
relative-precision sequential stopping rule. The basic idea here is that the simulation should
terminate when the $d$th root of the volume of the confidence region is less than an
$\epsilon$-fraction of the norm of the parameter $\alpha$. Since $Y(t)$ is an estimator for $\alpha$, this suggests
replacing $T_1(\epsilon)$ with

(2.4) $T_2(\epsilon) = \inf\{t \ge 0 : m(C(t))^{1/d} + a(t) \le \epsilon \|Y(t)\|\}$.

The next theorem shows that such relative-precision stopping rules have asymptotic
behavior analogous to that exhibited by absolute-precision stopping rules. Note that $T_2(\epsilon)$
behaves asymptotically like $T_1(\|\alpha\| \epsilon)$, as one would expect. For the analog of
Theorem 1, it is important that $Y(t)$ now be a strongly consistent estimator of $\alpha$. This is a
reasonable condition, but it does not follow from (2.1); e.g., see Example 2 of Glynn and
Whitt (1988).

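THEOREM 3. Suppose that (2.1) and (2.2) hold. If $Y(t) \to \alpha$ and $\Gamma(t) \to \Gamma$ w.p.1 as
$t \to \infty$, where $\|\alpha\| > 0$, then as $t \to \infty$ and $\epsilon \to 0$

(i) $t^{\gamma} [m(C(t))^{1/d} + a(t)] / \|Y(t)\| \to \|\alpha\|^{-1} m(\Gamma A)^{1/d}$ w.p.1,

(ii) $\epsilon^{1/\gamma} T_2(\epsilon) \to \|\alpha\|^{-1/\gamma} m(\Gamma A)^{1/(d\gamma)}$ w.p.1,

(iii) $\epsilon^{-1} m(C(T_2(\epsilon)))^{1/d} \to \|\alpha\|$ w.p.1,

(iv) $\epsilon^{-1} [Y(T_2(\epsilon)) - \alpha] \Rightarrow \|\alpha\|\, m(\Gamma A)^{-1/d}\, \Gamma \mathcal{Y}(1)$ in $\mathbb{R}^d$,

(v) $P(\alpha \in C(T_2(\epsilon))) \to 1 - \delta$ (asymptotic validity).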

The  proof  of  Theorem  3  is  a  minor  modification  of  the  proof  of  Theorem  1,  and  is 
therefore  omitted.  There  is  also  an  analog  of  Theorem  2  for  the  relative-precision 
sequential  stopping  rules,  but  we  do  not  state  it.  Nothing  beyond  the  assumptions  of 
Theorem 2 is needed except $\|\alpha\| > 0$.

3.  EXAMPLES 

In this section, we discuss the implications of Theorems 1 and 3 in a variety of
different  estimation  settings.  As  we  shall  see,  assumption  (2.1)  is  a  mild  hypothesis  that  is 
satisfied  in  virtually  all  practical  contexts.  Given  the  presence  of  such  a  FCLT,  the 
asymptotic  validity  of  a  sequential  stopping  rule  basically  depends  upon  the  availability  of 
a strongly consistent estimator for the scaling matrix $\Gamma$ that appears in (2.1).
(Alternatively  we  could  have  a  FWLLN,  as  in  Theorem  2.) 

Example  1.  The  Sample  Mean  of  i.i.d.  Random  Variables. 

Suppose that $\alpha$ can be represented as $\alpha = EX$ for some real-valued r.v. $X$. For
example, $\alpha$ might correspond to the expected number of customers served in a queue over
the time interval $[0, T]$. Then $\alpha$ can be estimated by generating i.i.d. replicates $X_1, X_2, \ldots$
of the r.v. $X$; the resulting estimator for $\alpha$ is then the sample mean $\bar{X}_n = n^{-1} \sum_{i=1}^{n} X_i$. The
corresponding estimation process is $Y(t) = \bar{X}_{\lfloor t \rfloor}$, where $\lfloor t \rfloor$ is the greatest integer less than
or equal to $t$ and $\bar{X}_0 = 0$. If $EX^2 < \infty$, Donsker's theorem asserts that (2.1) holds with $\gamma = 1/2$,
$\Gamma = \sigma$ where $\sigma^2 = \operatorname{var} X$, and $\mathcal{Y}(t) = B(t)/t$ where $B(t)$ is standard (zero-drift,
unit-diffusion-coefficient) Brownian motion; see §16 of Billingsley (1968). Note that
$\mathcal{Y}(1) \stackrel{d}{=} N(0, 1)$. The typical choice for the set $A$ in this setting is the interval
$[-z(\delta), z(\delta)]$, where $z(\delta)$ is chosen to satisfy $P\{N(0, 1) \le z(\delta)\} = 1 - \delta/2$. Of course, it
is well known that

$\Gamma_n \equiv \left[ \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2 \right]^{1/2} \to \sigma$ w.p.1 as $n \to \infty$.


Suppose that $\sigma^2 > 0$. Setting $\Gamma(t) = \Gamma_{\lfloor t \rfloor}$, we have the strong consistency required by
Theorems 1 and 3. Hence, both the absolute and relative precision stopping rules $T_1(\epsilon)$
and $T_2(\epsilon)$ are asymptotically valid for this example when the precision constant $\epsilon$ shrinks
to zero. In this setting, Theorems 1 and 3 reproduce the results of Chow and Robbins
(1965), Starr (1966), and Nadas (1969). Implementation considerations are discussed in
Law, Kelton, and Koenig (1981).
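
For the scalar i.i.d. setting of this example, the absolute and relative precision rules can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended implementation: the function name `sequential_mean`, the argument `draw`, the cap `max_n`, and in particular the inflation function $a(t) = 1/t$ (which is $o(t^{-1/2})$) are assumptions made for the sketch and do not come from the paper.

```python
import math
import random
from statistics import NormalDist


def sequential_mean(draw, eps, delta=0.05, relative=False, max_n=10**7):
    """Sample until the inflated interval length meets the precision target.

    draw     -- zero-argument callable returning one i.i.d. replicate X_i
    eps      -- precision constant (absolute length, or fraction of |mean|)
    Returns (n, mean, half_width) at the stopping time.
    """
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)    # z(delta)
    n, mean, m2 = 0, 0.0, 0.0                      # Welford running moments
    while n < max_n:
        x = draw()
        n += 1
        d = x - mean
        mean += d / n
        m2 += d * (x - mean)
        if n < 2:
            continue
        gamma_n = math.sqrt(m2 / (n - 1))          # Gamma_n, estimates sigma
        length = 2.0 * z * gamma_n / math.sqrt(n)  # m(C(n)) for A = [-z, z]
        a_n = 1.0 / n                              # assumed inflation a(t) = 1/t
        target = eps * abs(mean) if relative else eps
        if length + a_n <= target:
            return n, mean, length / 2.0
    raise RuntimeError("max_n reached before the stopping criterion was met")


if __name__ == "__main__":
    rng = random.Random(12345)
    print(sequential_mean(lambda: rng.expovariate(1.0), eps=0.1))
```

For exponential data with unit mean, $\epsilon = 0.1$ and $\delta = 0.05$, the deterministic benchmark $(2\sigma z(\delta)/\epsilon)^2$ is roughly 1,500 observations, so the realized stopping time should be of that order.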


Example 2. The Sample Mean of i.i.d. Random Vectors.

Now we consider the case in which $\alpha$ can be represented as $\alpha = EX$, where $X$ is
$\mathbb{R}^d$-valued. Assume that $E\|X\|^2 < \infty$. As in Example 1, we can estimate $\alpha$ via the sample
mean $\bar{X}_n = n^{-1} \sum_{i=1}^{n} X_i$, where the $X_i$'s are i.i.d. copies of $X$. Setting $Y(t) = \bar{X}_{\lfloor t \rfloor}$, we
obtain (2.1) from the $d$-dimensional version of Donsker's theorem (e.g., see Glynn and
Whitt (1987)), where $\mathcal{Y}(t) = B(t)/t$, $B$ is $d$-dimensional standard Brownian motion
(composed of $d$ independent 1-dimensional standard Brownian motions) and $\Gamma \Gamma'$ is the
covariance matrix $C$ of $X$. We assume that $C$ is positive definite. Note that
$\mathcal{Y}(1) = B(1) \stackrel{d}{=} N(0, I)$. In this $d$-dimensional setting, we can assume that $A$ is the
$d$-sphere $\{x : \|x\| \le w(\delta)\}$, where $w(\delta)$ is chosen so that

$P\{\|N(0, I)\|^2 \le w^2(\delta)\} = P\{\chi_d^2 \le w^2(\delta)\} = 1 - \delta$,

with $\chi_d^2$ being a chi-square r.v. with $d$ degrees of freedom. Let

$C_n = \frac{1}{n} \sum_{i=1}^{n} X_i X_i' - \bar{X}_n \bar{X}_n'$

(writing all $d$-vectors as column vectors). Then, $C_n \to C$ a.s. as $n \to \infty$. Let $\Gamma_n$ be obtained
by taking the Cholesky factorization of $C_n$, so that $\Gamma_n$ is a lower triangular matrix such
that $C_n = \Gamma_n \Gamma_n'$; see pp. 164-165 of Bratley, Fox and Schrage (1987). It follows that
$\Gamma_n \to \Gamma$ w.p.1 as $n \to \infty$, since Cholesky factors are continuous at positive definite matrices.
Setting $\Gamma(t) = \Gamma_{\lfloor t \rfloor}$, we again have the strong consistency required by Theorems 1 and 3.
Thus, we have proved that the absolute and relative precision stopping rules $T_1(\epsilon)$ and
$T_2(\epsilon)$ are asymptotically valid for sequential stopping of multiple performance measure
stochastic simulations.
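
The quantities in this example can be computed directly. The sketch below, in plain Python, forms $C_n$ in the $1/n$ form above, takes its lower-triangular Cholesky factor, and evaluates the root volume $m(C(n))^{1/d}$ for the special case $d = 2$, where the chi-square quantile has the closed form $w^2(\delta) = -2 \ln \delta$. The function names and the restriction to $d = 2$ are assumptions made for illustration; a general $d$ would need a chi-square quantile routine.

```python
import math


def cholesky_lower(c):
    """Lower-triangular L with L L' = c, for a symmetric positive definite c."""
    d = len(c)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = c[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def sample_covariance(xs):
    """C_n = n^{-1} sum X_i X_i' - Xbar Xbar' (the 1/n form of Example 2)."""
    n, d = len(xs), len(xs[0])
    mean = [sum(x[k] for x in xs) / n for k in range(d)]
    return [[sum(x[i] * x[j] for x in xs) / n - mean[i] * mean[j]
             for j in range(d)] for i in range(d)]


def root_volume_2d(cov, n, delta=0.05):
    """m(C(n))^{1/2} for d = 2, gamma = 1/2 and A a disc of radius w(delta)."""
    gam = cholesky_lower(cov)
    det = gam[0][0] * gam[1][1]          # determinant of the triangular factor
    w_sq = -2.0 * math.log(delta)        # chi-square(2) quantile, closed form
    vol_a = math.pi * w_sq               # area of the disc A
    return math.sqrt(det * vol_a / n)    # m(C(n)) = n^{-1} det(Gamma_n) m(A)
```

A stopping rule in the spirit of (2.3) would compare `root_volume_2d(sample_covariance(xs), len(xs))` plus an inflation term against $\epsilon$ after each new observation.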

Example 3. Functions of Sample Means.

Let $X$ be an $\mathbb{R}^d$-valued random vector and let $\mu = EX$. Suppose that $\alpha$ can be
represented as $\alpha = g(\mu)$ for some (known) real-valued function $g : \mathbb{R}^d \to \mathbb{R}$. An example
of this occurs in the ratio estimation setting, in which $d = 2$ and $g(x, y) = x/y$. Because
the steady-state mean of a regenerative stochastic process can be expressed as a ratio of two
means, this estimation setting subsumes that of regenerative steady-state simulation. Of
course, this observation lies at the heart of the regenerative method of steady-state
simulation; e.g., see Crane and Lemoine (1977).

In this nonlinear setting, we estimate $\alpha$ via $Y(t) = g(\bar{X}_{\lfloor t \rfloor})$, where the $X_i$ are i.i.d. random
vectors as in Example 2. Suppose that $E\|X\|^2 < \infty$ and that $g$ is continuously
differentiable in a neighborhood of $\mu$. In addition, we require that $\nabla g(\mu) \ne 0$ and that
the covariance matrix $C$ of $X$ is positive definite. Then Theorem 3 of Glynn and
Whitt (1989) implies that (2.1) holds with $\gamma = 1/2$, $\mathcal{Y}(t) = B(t)/t$ and $\Gamma = \sigma$ as in
Example 1, but with $\sigma = (\nabla g(\mu)' C \nabla g(\mu))^{1/2}$.

Let $C_n$ be defined as in Example 2 and note that $[\nabla g(Y(t))' C_{\lfloor t \rfloor} \nabla g(Y(t))]^{1/2} \to \sigma$ w.p.1
as $t \to \infty$. Hence, we have the strong consistency required for the application of
Theorems 1 and 3. As a consequence, we are assured that the stopping rules $T_1(\epsilon)$ and
$T_2(\epsilon)$ are asymptotically valid in this estimation setting. In particular, in the regenerative
simulation setting, we recover the asymptotic theory developed by Lavenberg and Sauer
(1977).
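
As an illustration of the ratio case $g(x, y) = x/y$, the following Python sketch computes the point estimate together with a plug-in value for $\sigma = (\nabla g(\mu)' C \nabla g(\mu))^{1/2}$. It is a sketch under stated assumptions: the function name `ratio_and_sigma` is hypothetical, the covariance entries use the $1/n$ form of Example 2, and the gradient is evaluated at the sample-mean vector, which is one reading of the plug-in estimator rather than something the text spells out.

```python
import math


def ratio_and_sigma(pairs):
    """Ratio estimate xbar / ybar and a plug-in delta-method sigma."""
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    # Entries of C_n in the 1/n form used in Example 2.
    cxx = sum(x * x for x, _ in pairs) / n - xbar * xbar
    cyy = sum(y * y for _, y in pairs) / n - ybar * ybar
    cxy = sum(x * y for x, y in pairs) / n - xbar * ybar
    # Gradient of g(x, y) = x / y evaluated at the sample mean (assumption).
    gx, gy = 1.0 / ybar, -xbar / (ybar * ybar)
    var = gx * gx * cxx + 2.0 * gx * gy * cxy + gy * gy * cyy
    return xbar / ybar, math.sqrt(max(var, 0.0))  # guard against rounding below 0
```

In a regenerative simulation, each pair would hold a cycle reward and a cycle length; when the reward is exactly proportional to the length, the plug-in $\sigma$ is zero up to rounding, as one would expect.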

Example  4.  The  Jackknife. 

Consider the estimation problem of Example 3 in which our goal is to estimate
$\alpha = g(\mu)$, where $\mu$ can be expressed as $\mu = EX$ and $g$ is real-valued. One practical
difficulty with the estimator suggested in Example 3 is that it tends to be significantly
affected by bias problems induced by the presence of the nonlinearity in $g$. One way to
address the small-sample bias problem that this nonlinearity creates is to jackknife the
estimator. Specifically, let $\alpha(n) = g(\bar{X}_n)$ and, for $1 \le i \le n$, let

(3.1) $\bar{X}_{in} = \frac{1}{n-1} \sum_{j=1,\, j \ne i}^{n} X_j$ and $\tilde{\alpha}_i(n) = n \alpha(n) - (n-1) \alpha_i(n)$,

where $\alpha_i(n) = g(\bar{X}_{in})$. Then the estimator $Y_n = n^{-1} \sum_{i=1}^{n} \tilde{\alpha}_i(n)$ is the jackknife estimator of $\alpha$. Let $Y(t) = Y_{\lfloor t \rfloor}$. It

is shown in Glynn and Heidelberger (1989) that if a suitable moment of $\|X\|$ is finite and $g$ is twice
continuously differentiable in a neighborhood of $\mu$, then (2.1) holds where $\sigma$ and $\mathcal{Y}(t)$ are
as  in  Example  3.  Since  the  form  of  the  FCLT  (2.1)  is  the  same  as  for  Example  3,  the 
jackknife  has  the  same  asymptotic  efficiency  as  the  estimator  of  Example  3.  However,  as 
argued  in  Miller  (1974),  the  jackknife  estimator  typically  possesses  superior  small-sample 
bias  properties. 




Two estimators for the scaling constant $\sigma = [\nabla g(\mu)' C \nabla g(\mu)]^{1/2}$ are possible. One
approach is to use the estimator $\sigma(t) = [\nabla g(Y(t))' C_{\lfloor t \rfloor} \nabla g(Y(t))]^{1/2}$ suggested in
Example 3. Theorem 4(i) of Glynn and Heidelberger (1989) shows that $Y(t) \to \alpha$ w.p.1 as
$t \to \infty$ under the conditions stated here. Since $C_n \to C$ w.p.1, it follows that $\sigma(t) \to \sigma$ w.p.1
as $t \to \infty$. Hence, sequential stopping procedures based on the jackknife point estimator
and the “variance” estimator $\sigma^2(t)$ are asymptotically valid by Theorems 1 and 3,
provided that $\sigma^2 > 0$.

An alternative estimator for the scaling constant $\sigma$ is given by the jackknife variance
estimator $\sigma_J(t)$:

$\sigma_J(t) =$

Although it is known that $\sigma_J^2(t) \Rightarrow \sigma^2$ as $t \to \infty$ under suitable regularity conditions (see,
for example, Miller (1964) and Miller (1974)), we need convergence w.p.1 in order to
satisfy the hypotheses of Theorems 1 and 3. We therefore establish the following result.

THEOREM 4. If $g$ is continuously differentiable in a neighborhood of $\mu$ and $E\|X\|^2 < \infty$,
then $\sigma_J^2(t) \to \sigma^2 = \nabla g(\mu)' C \nabla g(\mu)$ w.p.1 as $t \to \infty$.

Thus, the sequential stopping rules $T_1(\epsilon)$ and $T_2(\epsilon)$ may be applied to jackknifed point
estimators in conjunction with the jackknifed variance estimator $\sigma_J^2(t)$.
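
The jackknife construction in (3.1) is mechanical enough to sketch in Python. The point estimate below follows (3.1): each pseudovalue is $n\alpha(n)$ minus $(n-1)$ times $g$ evaluated at the leave-one-out mean. The variance line is an assumption: it uses the usual sample variance of the pseudovalues, which is the standard jackknife variance estimate in the literature but is not a formula taken from this text. The function name `jackknife` is likewise hypothetical.

```python
def jackknife(xs, g):
    """Jackknife point estimate and pseudovalue variance for alpha = g(mean).

    xs -- list of equal-length tuples (i.i.d. replicates X_1, ..., X_n), n >= 2
    g  -- function mapping a mean vector (tuple) to a real number
    """
    n, d = len(xs), len(xs[0])
    total = [sum(x[k] for x in xs) for k in range(d)]
    alpha_n = g(tuple(t / n for t in total))                # alpha(n) = g(Xbar_n)
    pseudo = []
    for x in xs:
        loo_mean = tuple((total[k] - x[k]) / (n - 1) for k in range(d))
        pseudo.append(n * alpha_n - (n - 1) * g(loo_mean))  # pseudovalue of (3.1)
    y_n = sum(pseudo) / n                                   # jackknife estimator Y_n
    # Assumed form: sample variance of the pseudovalues.
    var_j = sum((p - y_n) ** 2 for p in pseudo) / (n - 1)
    return y_n, var_j
```

For $g(x) = x^2$ and data 1, 2, 3, the plain estimator gives $\bar{X}_n^2 = 4$, while the jackknife returns $4 - 1/3$, removing the $s^2/n$ bias term exactly in this quadratic case.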

Example 5. A Steady-State Mean.

Suppose that our goal is to estimate the steady-state mean vector $\alpha$ of an $\mathbb{R}^d$-valued
stochastic process $X = \{X(t) : t \ge 0\}$. We assume that $X$ satisfies a FCLT, namely

(3.2) $\epsilon^{-1/2} \left( \epsilon \int_0^{t/\epsilon} X(s)\,ds - t\alpha \right) \Rightarrow \Gamma B(t)$ in $D(0, \infty)$ as $\epsilon \downarrow 0$,

where $B$ is a standard $\mathbb{R}^d$-valued Brownian motion. It is easily shown that (3.2) implies
that

$Y(t) \equiv \frac{1}{t} \int_0^t X(s)\,ds \Rightarrow \alpha$ as $t \to \infty$.

Hence, (3.2) implies that the centering vector $\alpha$ appearing in (3.2) is indeed the
steady-state mean of $X$. Another easy consequence of (3.2) is that (2.1) holds with $\gamma = 1/2$ and
$\mathcal{Y}(t) = B(t)/t$.


It turns out that (3.2) is typically satisfied for most “real-world” steady-state
simulations. In particular, a great variety of different assumptions on the structure of the
process $X$ give rise to FCLTs of the form (3.2). For example, such FCLTs hold when $X$
is regenerative and satisfies suitable moment conditions (see Glynn and Whitt (1987)), or
when $X$ is a martingale process or when $X$ satisfies appropriate mixing conditions (see
Chapter 7 of Ethier and Kurtz (1986)), or when there is appropriate positive dependence
in the $X$ process (specifically, when the $X(t)$'s are associated; see Newman and
Wright (1981)).

The primary difficulty in applying Theorems 1-3 arises in the construction of a
process $\Gamma(t)$ such that $\Gamma(t) \to \Gamma$ w.p.1 as $t \to \infty$ or $\Gamma(t/\epsilon) \Rightarrow \Gamma$ in $D(0, \infty)$ as $\epsilon \downarrow 0$. Since $\Gamma \Gamma'$
is the covariance matrix of the limiting Brownian motion, this is equivalent to the
construction of a strongly consistent estimator $C(t)$ for the time-average covariance matrix
$C = \Gamma \Gamma'$ of $X$. In general, this is known to be a challenging problem.


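Suppose that $X$ is regenerative, with regeneration times $0 = \tau_0 < \tau_1 < \tau_2 < \cdots$.
Suppose that

$E \left[ \int_{\tau_1}^{\tau_2} \|X(s) - \alpha\|\,ds \right]^2 < \infty$

and that $E(\tau_2 - \tau_1) < \infty$. Let $N(t) = \max\{n \ge 0 : \tau_n \le t\}$. Then, it is easily proved that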
$C(t) = \frac{1}{t} \sum_{i=1}^{N(t)} \left[ \int_{\tau_{i-1}}^{\tau_i} (X(s) - Y(t))\,ds \right] \left[ \int_{\tau_{i-1}}^{\tau_i} (X(s) - Y(t))\,ds \right]'$

is strongly consistent for $C$, where $C = \Gamma \Gamma'$ and $\Gamma$ is the scaling matrix appearing in (3.2).
Thus, when $X$ is regenerative, the sequential stopping rules $T_1(\epsilon)$ and $T_2(\epsilon)$ are
asymptotically valid. Of course, when $X$ is scalar, we already established this result in
Example 3.

For  non-regenerative  processes,  less  is  known  about  the  strong  consistency  of 
estimators $C(t)$ for the steady-state covariance matrix. However, Glynn and Iglehart
(1988) and Damerdji (1989a, b) have recently used strong approximation techniques to
establish  strong  consistency  for  a  broad  class  of  estimators  for  C.  Thus,  Theorems  1  and  3 
prove  that  these  estimators  do  indeed  lead  to  asymptotically  valid  sequential  procedures. 

Our theory for this example provides theoretical support complementing previous work
by Fishman (1977), Law and Carson (1979), and Law and Kelton (1982), who develop
specific empirically-based sequential stopping rules for steady-state simulations.

Example  6.  Kiefer-Wolfowitz  Stochastic  Approximation. 

This example is interesting, in part, because it illustrates that the FCLT (2.1) can hold
for the estimator with a subcanonical convergence rate; in particular, here $\gamma = 1/3$. For
other examples of non-canonical estimator convergence rates, see Fox and Glynn (1989)
and §§5, 6 of Glynn and Whitt (1989). Suppose that we are given a real-valued smooth
function $\beta(\theta)$, which can be represented as $\beta(\theta) = EZ(\theta)$. Assume that our goal is to
compute the parameter $\alpha = \theta^*$ minimizing $\beta$. If $\theta$ is scalar, we can apply the following
Kiefer-Wolfowitz stochastic approximation algorithm:


\[
\theta_{n+1} = \theta_n - c_n X_{n+1},
\]
where $\{c_n: n \ge 0\}$ is a sequence of (deterministic) non-negative constants and
\[
P(X_{n+1} \in A \mid \theta_0, X_0, \ldots, \theta_n, X_n) = P\bigl([Z(\theta_n + h_{n+1}) - Z(\theta_n - h_{n+1})]/2h_{n+1} \in A\bigr).
\]
Here $Z(\theta_n + h_{n+1})$ and $Z(\theta_n - h_{n+1})$ are generated independently of one another and
$\{h_n: n \ge 1\}$ is another sequence of deterministic constants. Suppose that $c_n = cn^{-1}$ and
$h_n = hn^{-1/3}$ $(c, h > 0)$. Let $Y(t) = \theta_{\lfloor t \rfloor}$. Then, Ruppert (1982) shows that under
suitable regularity conditions, (2.1) holds with $\gamma = 1/3$, $\Gamma = \kappa$, $\mathcal{Y}(t) = t^{-b}B(t^{2\eta+1})$, where $B$ is
a standard Brownian motion, $b = c\beta''(\theta^*)$, $\eta = b - 5/6$, $\kappa^2 = c^2\sigma^2/[(2\eta + 1)(4h^2)]$ and
$\sigma^2 = 2\,\mathrm{var}\,Z(\theta^*)$.

The construction of a strongly consistent estimator for $\Gamma = \kappa$ involves more work. For
some directions on how to obtain such an estimator, see p. 189 of Venter (1967).
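
As a rough illustration of the Kiefer-Wolfowitz recursion in this example, the following Python sketch runs the central-difference update on a toy problem. The test function (theta - 2)^2, the noise level, the seed, and the names `kiefer_wolfowitz` and `noisy_quadratic` are all hypothetical choices made for illustration. The gain is taken as c/(n+1) rather than c/n only to avoid dividing by zero at the first step; that indexing is an assumption of the sketch.

```python
import random


def kiefer_wolfowitz(noisy_value, theta0, c, h, n_steps, rng):
    """Run theta_{n+1} = theta_n - c_n * X_{n+1} with a central-difference X_{n+1}."""
    theta = theta0
    for n in range(n_steps):
        h_next = h * (n + 1) ** (-1.0 / 3.0)  # h_{n+1} = h * (n+1)^(-1/3)
        c_n = c / (n + 1)                     # assumed gain indexing (see lead-in)
        # Two independent noisy evaluations, one on each side of theta_n.
        z_plus = noisy_value(theta + h_next, rng)
        z_minus = noisy_value(theta - h_next, rng)
        x_next = (z_plus - z_minus) / (2.0 * h_next)
        theta -= c_n * x_next
    return theta


def noisy_quadratic(theta, rng):
    # Hypothetical objective: beta(theta) = (theta - 2)^2 observed with N(0, 0.1^2) noise.
    return (theta - 2.0) ** 2 + rng.gauss(0.0, 0.1)


rng = random.Random(12345)
theta_hat = kiefer_wolfowitz(noisy_quadratic, theta0=0.0, c=1.0, h=1.0,
                             n_steps=20000, rng=rng)
print("estimated minimizer:", theta_hat)
```

In this toy setting the minimizer is 2, so the printed value should be close to 2. The sketch does not attempt to estimate the scaling constant needed for a stopping rule.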

Example 7. Robbins-Monro Stochastic Approximation.

As in Example 6, suppose that our goal is to estimate the minimizer $\theta^*$ of a smooth
function $\beta: R \to R$. However, we assume here that we can represent the derivative $\beta'$ as
an expectation, i.e., there exists a process $Z(\theta)$ such that $\beta'(\theta) = EZ(\theta)$. (In Example 6
we assumed only that the function values $\beta(\theta)$ could be represented as expectations.) To
calculate $\theta^*$ in this setting, we can use the Robbins-Monro stochastic approximation
algorithm:
\[
\theta_{n+1} = \theta_n - c_n X_{n+1},
\]
where $\{c_n: n \ge 0\}$ is a sequence of (deterministic) non-negative constants and
\[
P(X_{n+1} \in A \mid \theta_0, X_0, \ldots, \theta_n, X_n) = P(Z(\theta_n) \in A).
\]

Suppose that our estimator is $Y(t) = \theta_{\lfloor t \rfloor}$ and $c_n = cn^{-1}$ with $c > 0$. Then Ruppert
(1982) has shown that under suitable regularity hypotheses, the FCLT (2.1) holds with
$\gamma = 1/2$, $\Gamma = \kappa$, $\mathcal{Y}(t) = t^{-(D+1)}B(t^{2D+1})$, $D = c\beta''(\theta^*) - 1$, $\kappa^2 = c^2\sigma^2(2D + 1)^{-1}$,
$\sigma^2 = \mathrm{var}\,Z(\theta^*)$, and $B$ is a standard Brownian motion.

Construction of a strongly consistent estimator for $\Gamma = \kappa$ follows from results
established by Venter (1967). When this estimator is used, the sequential stopping rule
$T_1(\epsilon)$ reduces to one studied by McLeish (1976). Our analysis of the rule $T_2(\epsilon)$ seems to
be new.
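
The Robbins-Monro recursion of this example can be sketched in a few lines of Python. The derivative 2(theta - 2), the noise level, the seed, and the names `robbins_monro` and `noisy_slope` are hypothetical and chosen only for illustration. As an assumption of the sketch, the gain is c/(n+1) so that the first step is well defined.

```python
import random


def robbins_monro(noisy_derivative, theta0, c, n_steps, rng):
    """Run theta_{n+1} = theta_n - c_n * X_{n+1}, where X_{n+1} is a draw of Z(theta_n)."""
    theta = theta0
    for n in range(n_steps):
        c_n = c / (n + 1)                     # assumed gain indexing (see lead-in)
        x_next = noisy_derivative(theta, rng)  # noisy observation of beta'(theta_n)
        theta -= c_n * x_next
    return theta


def noisy_slope(theta, rng):
    # Hypothetical derivative: beta'(theta) = 2 * (theta - 2), observed with N(0, 0.5^2) noise.
    return 2.0 * (theta - 2.0) + rng.gauss(0.0, 0.5)


rng = random.Random(2024)
theta_rm = robbins_monro(noisy_slope, theta0=0.0, c=1.0, n_steps=20000, rng=rng)
print("estimated root of beta':", theta_rm)
```

With these toy choices the root of the derivative is 2, and the iterate should settle near it at roughly the square-root rate described above.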

4.  A  COUNTEREXAMPLE  FOR  WEAK  CONSISTENCY 

We have developed a framework to analyze the asymptotic behavior of sequential
stopping rules. Our analysis shows that a sequential stopping rule is asymptotically
well-behaved if the estimation process satisfies an FCLT as in (2.1) and if there exists a w.p.1 or
FWLLN limit for the estimator for the scaling matrix $\Gamma$ appearing in (2.1). The examples
of Section 3 strongly suggest that (2.1) is typically satisfied in applications, but strong
consistency or FWLLN consistency of the estimator of $\Gamma$ is often more difficult to verify.
As a consequence, it is natural to ask whether weak consistency (i.e., $\Gamma(t) \Rightarrow \Gamma$ as $t \to \infty$)
is enough.

Unfortunately, weak consistency is not enough. The difficulty is in establishing the
in-probability analog of Theorem 1(ii). If $\epsilon^{1/\gamma}T_1(\epsilon) \Rightarrow m(\Gamma A)^{1/(d\gamma)}$, then parts (iv) and
(v) of Theorem 1 would hold by the argument of Theorem 17.1 of Billingsley (1968). In
a proof of the in-probability analog of Theorem 1(ii), we cannot conclude that
$T_1(\epsilon)^{\gamma}V(T_1(\epsilon)) \Rightarrow m(\Gamma A)^{1/d}$ when $T_1(\epsilon) \Rightarrow \infty$ as $\epsilon \downarrow 0$ and $t^{\gamma}V(t) \Rightarrow m(\Gamma A)^{1/d}$ as
$t \to \infty$; see Richter (1965) and pp. 10-15 of Gut (1988) for counterexamples and discussion.

We now give a direct counterexample. Consider Example 1 and the process $\Gamma(t)$
defined there. Let $N$ be a unit rate Poisson process independent of $\{X_i: i \ge 1\}$ and let
$T_1, T_2, \ldots$ be the jump times of the process $N$. Suppose that
\[
\tilde{\Gamma}(t) =
\begin{cases}
\Gamma(t), & t \notin \bigcup_{n=1}^{\infty} [T_n, T_n + 1/n), \\
0, & t \in \bigcup_{n=1}^{\infty} [T_n, T_n + 1/n).
\end{cases}
\tag{4.1}
\]

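Then,
\[
P(\tilde{\Gamma}(t) \ne \Gamma(t)) = P\left(t - T_{N(t)} < \frac{1}{N(t)}\right) \le P(t - T_{N(t)} \le \epsilon) + P(N(t) \le 1/\epsilon)
\]
for $\epsilon$ arbitrary. Letting $t \to \infty$, we find that $\limsup_{t \to \infty} P(\tilde{\Gamma}(t) \ne \Gamma(t)) \le 1 - \exp(-\epsilon)$ (recall that
the equilibrium age distribution of $N$ is exponential with mean 1). Since $\epsilon$ was arbitrary, it
follows that $P(\tilde{\Gamma}(t) \ne \Gamma(t)) \to 0$ as $t \to \infty$. Then, it is evident that $\tilde{\Gamma}(t) \Rightarrow \sigma$ as $t \to \infty$,
since $\Gamma(t) \to \sigma$ w.p.1 as $t \to \infty$. Hence, $\tilde{\Gamma}(t)$ is weakly consistent for $\sigma$.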

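Now, in the setting of Example 1 using $\tilde{\Gamma}(t)$,
\[
\tilde{T}_1(\epsilon) = \inf\left\{t \ge 0: z(\delta)\left(\frac{\tilde{\Gamma}(t)}{\sqrt{t}} + a(t)\right) \le \epsilon\right\}.
\]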

Put $a(t) = 1/t$. Then, clearly $z(\delta)(\tilde{\Gamma}(s)/\sqrt{s} + 1/s) \ge z(\delta)/t$ for $s \le t$, so
$\tilde{T}_1(z(\delta)/t) \ge t$. On the other hand, $\tilde{\Gamma}(T_{N(t)+1}) = 0$, so $\tilde{T}_1(z(\delta)/t) \le T_{N(t)+1}$. By the
SLLN, $t^{-1}T_{N(t)+1} \to 1$ w.p.1 as $t \to \infty$. Hence, $\tilde{T}_1(z(\delta)/t) \sim t$ w.p.1 as $t \to \infty$. Thus, the
stopping rule is asymptotically independent of the scaling constant $\Gamma$. As a consequence,
formation of asymptotically valid confidence intervals is impossible. In fact, even the
asymptotic scaling of the rule is incorrect. It is well known that for estimation problems
of the type described in Example 1, the amount of simulation time required to obtain an
absolute precision of order $\epsilon$ is of order $\epsilon^{-2}$, whereas the stopping rule $\tilde{T}_1(\epsilon)$ based on
$\tilde{\Gamma}(t)$ in (4.1) yields a termination time of order $\epsilon^{-1}$.
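
The sandwich argument of this section can be checked numerically. The Python sketch below does not simulate the full continuous-time rule. It only computes the two bounds used in the argument: the rule cannot stop before time z/eps because of the a(t) = 1/t term, and it must have stopped by the first Poisson jump at or after z/eps, where the modified estimator is zero. The value z = 1.96, the seed, and the function name `first_jump_at_or_after` are illustrative assumptions.

```python
import random


def first_jump_at_or_after(level, rng):
    """Return the first jump time of a unit-rate Poisson process that is >= level."""
    t = 0.0
    while True:
        t += rng.expovariate(1.0)  # i.i.d. exponential(1) interarrival times
        if t >= level:
            return t


z = 1.96  # illustrative critical value z(delta)
rng = random.Random(7)
results = []
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    lower = z / eps                              # no stopping before z / eps
    upper = first_jump_at_or_after(lower, rng)   # estimator is zero at this jump
    results.append((eps, lower, upper))
    print(f"eps={eps:g}  lower={lower:.1f}  upper={upper:.1f}  ratio={upper / lower:.5f}")
```

The ratio of the two bounds approaches 1 as eps shrinks, so the stopping time grows like z/eps regardless of the variance of the observations.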
5. PROOFS

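Proof of Proposition 1. Since $\Gamma$ is nonsingular,
\[
P(\alpha \in C(t)) = P(\Gamma^{-1}t^{\gamma}(Y(t) - \alpha) \in A),
\]
but
\[
\Gamma^{-1}t^{\gamma}(Y(t) - \alpha) = \Gamma^{-1}\mathcal{Y}_{1/t}(1) \Rightarrow \mathcal{Y}(1) \quad \text{as } t \to \infty.
\]
Since $P(\mathcal{Y}(1) \in \partial A) = 0$, it follows that
\[
P(\Gamma^{-1}t^{\gamma}(Y(t) - \alpha) \in A) \to P(\mathcal{Y}(1) \in A) = 1 - \delta \quad \text{as } t \to \infty;
\]
see Theorem 2.1 of Billingsley (1968).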

Proof of Proposition 2. By (2.1) and Theorem 4.4 of Billingsley (1968),
\[
[\Gamma(t), t^{\gamma}(Y(t) - \alpha)] \Rightarrow [\Gamma, \Gamma\mathcal{Y}(1)] \quad \text{as } t \to \infty.
\]
Then we can apply the continuous mapping theorem, Theorem 5.1 of Billingsley (1968), to
deduce that
\[
\Gamma(t)^{-1}t^{\gamma}(Y(t) - \alpha) \Rightarrow \Gamma^{-1}\Gamma\mathcal{Y}(1) = \mathcal{Y}(1) \quad \text{as } t \to \infty.
\]
To apply the continuous mapping theorem, note that matrix inversion is continuous at all
nonsingular limits. (To see this, note that the determinant is a continuous function of the
matrix elements.) The rest of the proof is identical to that of Proposition 1.

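Proof of Theorem 1. Let $V(t) = m(C(t))^{1/d} + a(t)$. By the spatial invariance and the
scaling properties of Lebesgue measure $m$,
\[
m(Y(t) - t^{-\gamma}\Gamma(t)A)^{1/d} = m(t^{-\gamma}\Gamma(t)A)^{1/d} = t^{-\gamma}m(\Gamma(t)A)^{1/d}.
\]
Since $A$ is a bounded set, $\Gamma(t)A$ is contained in a bounded set for all sufficiently large $t$
w.p.1. It then follows from the bounded convergence theorem that
$m(\Gamma(t)A)^{1/d} \to m(\Gamma A)^{1/d}$ w.p.1 as $t \to \infty$. Recalling that $a(t) = o(t^{-\gamma})$, we conclude that
\[
t^{\gamma}V(t) \to m(\Gamma A)^{1/d} > 0 \quad \text{w.p.1 as } t \to \infty,
\tag{5.1}
\]
since $m(A) > 0$ and $\Gamma$ is non-singular. This establishes part (i). By definition of $T_1(\epsilon)$,
$V(T_1(\epsilon) - 1) > \epsilon$ and there exists a random variable $Z(\epsilon)$ with $0 < Z(\epsilon) < 1$ such that
$V(T_1(\epsilon) + Z(\epsilon)) \le \epsilon$. (Note that $V(t)$ is not necessarily monotone.) By (5.1) and the fact
that $T_1(\epsilon) \to \infty$ w.p.1 as $\epsilon \downarrow 0$,
\[
\limsup_{\epsilon \downarrow 0} \epsilon T_1(\epsilon)^{\gamma} \le \lim_{\epsilon \downarrow 0} T_1(\epsilon)^{\gamma}V(T_1(\epsilon) - 1) = m(\Gamma A)^{1/d} \quad \text{w.p.1.}
\]
Similarly,
\[
\liminf_{\epsilon \downarrow 0} \epsilon T_1(\epsilon)^{\gamma} \ge \lim_{\epsilon \downarrow 0} T_1(\epsilon)^{\gamma}V(T_1(\epsilon) + Z(\epsilon)) = m(\Gamma A)^{1/d} \quad \text{w.p.1.}
\]
This proves part (ii).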
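
Part (ii) can be illustrated for scalar i.i.d. observations, where gamma = 1/2 and d = 1. The Python sketch below is a minimal discrete-time version under stated assumptions: it stops at the first sample size n with z(s_n/sqrt(n) + 1/n) <= eps, where s_n is the sample standard deviation, and compares eps^2 times the stopping time with (z sigma)^2. The normal data, the half-width form of the rule, z = 1.96, the seed, and the name `sequential_stop` are illustrative choices rather than the paper's exact construction.

```python
import math
import random


def sequential_stop(eps, z, sample, rng, max_n=10**7):
    """Return (n, mean) at the first n >= 2 with z * (s_n / sqrt(n) + 1/n) <= eps."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's update)
    for n in range(1, max_n + 1):
        x = sample(rng)
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n >= 2:
            s_n = math.sqrt(m2 / (n - 1))
            if z * (s_n / math.sqrt(n) + 1.0 / n) <= eps:
                return n, mean
    raise RuntimeError("stopping rule did not trigger within max_n samples")


sigma = 2.0
z = 1.96
eps = 0.02
rng = random.Random(11)
n_stop, estimate = sequential_stop(eps, z, lambda r: r.gauss(5.0, sigma), rng)
ratio = eps ** 2 * n_stop / (z * sigma) ** 2
print("stopping time:", n_stop, " scaled ratio:", ratio, " estimate:", estimate)
```

For small eps the scaled ratio should be close to 1, which is the scalar form of the limit in part (ii).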

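For part (iii), note that $m(C(T_1(\epsilon)))^{1/d} = T_1(\epsilon)^{-\gamma}m(\Gamma(T_1(\epsilon))A)^{1/d}$ and recall that
$m(\Gamma(t)A) \to m(\Gamma A)$ w.p.1 as $t \to \infty$, so that $m(\Gamma(T_1(\epsilon))A) \to m(\Gamma A)$ w.p.1 as $\epsilon \to 0$. By
(ii), $\epsilon^{-1}T_1(\epsilon)^{-\gamma} \to m(\Gamma A)^{-1/d}$. Hence,
\[
\epsilon^{-1}m(C(T_1(\epsilon)))^{1/d} = \epsilon^{-1}T_1(\epsilon)^{-\gamma}m(\Gamma(T_1(\epsilon))A)^{1/d} \to m(\Gamma A)^{-1/d}m(\Gamma A)^{1/d} = 1.
\]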

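To obtain part (iv), let $\beta = m(\Gamma A)^{1/d}$ and set $T_{\epsilon}(t) = T_1(\epsilon)\epsilon^{1/\gamma}\beta^{-1/\gamma}t$, $t \ge 0$. Since
$T_{\epsilon} \to e$ as $\epsilon \downarrow 0$, where $e(t) = t$, it follows from the FCLT (2.1) and a standard
random-time-change argument, p. 144 of Billingsley (1968), that
\[
\mathcal{Y}_{\epsilon^{1/\gamma}\beta^{-1/\gamma}}(T_{\epsilon}(1)) \Rightarrow \Gamma\mathcal{Y}(1) \quad \text{as } \epsilon \downarrow 0,
\tag{5.2}
\]
where
\[
\mathcal{Y}_{\epsilon^{1/\gamma}\beta^{-1/\gamma}}(T_{\epsilon}(1)) = \beta\epsilon^{-1}(Y(T_1(\epsilon)) - \alpha).
\tag{5.3}
\]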


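To establish (v), note that
\begin{align*}
P(\alpha \in C(T_1(\epsilon))) &= P(Y(T_1(\epsilon)) - \alpha \in T_1(\epsilon)^{-\gamma}\Gamma(T_1(\epsilon))A) \tag{5.4} \\
&= P\bigl((\beta^{-1}\epsilon T_1(\epsilon)^{\gamma}\Gamma(T_1(\epsilon))^{-1})\,\beta\epsilon^{-1}[Y(T_1(\epsilon)) - \alpha] \in A;\ \det(\Gamma(T_1(\epsilon))) \ne 0\bigr) \\
&\quad + P\bigl(Y(T_1(\epsilon)) - \alpha \in T_1(\epsilon)^{-\gamma}\Gamma(T_1(\epsilon))A;\ \det(\Gamma(T_1(\epsilon))) = 0\bigr).
\end{align*}
Since $P(\det(\Gamma(T_1(\epsilon))) = 0) \to 0$ as $\epsilon \to 0$, the second term converges to 0. By a
convergence-together argument, Theorem 4.1 of Billingsley (1968), the first term has the
same limit as $P(\Gamma^{-1}\beta\epsilon^{-1}[Y(T_1(\epsilon)) - \alpha] \in A)$, which is $P(\mathcal{Y}(1) \in A) = 1 - \delta$ by part (iv)
because $P(\mathcal{Y}(1) \in \partial A) = 0$; see Theorem 2.1 of Billingsley (1968) and (2.2). To establish the
convergence together, note that $P(\det(\Gamma(T_1(\epsilon))) \ne 0) \to 1$ as $\epsilon \to 0$ and
$\beta^{-1}\epsilon T_1(\epsilon)^{\gamma}\Gamma(T_1(\epsilon))^{-1} \to \Gamma^{-1}$ w.p.1 as $\epsilon \downarrow 0$.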

Proof of Theorem 2. For part (i), modify the proof of part (i) of Theorem 1. First, apply
the assumed convergence of $\Gamma(t/\epsilon)$ in $D(0,\infty)$ as $\epsilon \downarrow 0$ to obtain
$m(\Gamma(t/\epsilon)A)^{1/d} \Rightarrow m(\Gamma A)^{1/d}$ in $D(0,\infty)$ as $\epsilon \downarrow 0$. For this purpose, use the w.p.1
convergence representation of convergence in probability; e.g., see p. 68 of Whitt (1980).
(Consider a sequence of random elements of a separable metric space. Then $X_n \to X$ in probability
if and only if every subsequence of $\{X_n\}$ has a further subsequence converging w.p.1 to $X$.)
Then, for any w.p.1 convergent subsequence of $\{\Gamma(t/\epsilon): 0 < \epsilon < 1\}$ converging to $\Gamma$,
note that the intersection and union of $\Gamma(t/\epsilon)A$ over $t \in [t_0, t_1]$ for $0 < t_0 < t_1 < \infty$ both
approach $\Gamma A$ w.p.1 as $\epsilon \to 0$. (Since the limit is a continuous function, convergence in
$D(0,\infty)$ is equivalent to uniform convergence on bounded intervals.) Then apply the
standard bounded convergence theorem with the w.p.1 convergence to get
$m(\Gamma(t/\epsilon)A)^{1/d} \to m(\Gamma A)^{1/d}$ for this subsequence uniformly for $t \in [t_0, t_1]$. This yields
w.p.1 convergence in $D(0,\infty)$. Finally, since the same limit is obtained for every w.p.1
convergent subsequence, we have the desired convergence in probability. Since
$\epsilon^{-\gamma}a(t/\epsilon) \to 0$ uniformly in $t$ for $t \ge t_0 > 0$, we obtain the following analog of (5.1):
\[
(\epsilon/t)^{-\gamma}V(t/\epsilon) \Rightarrow m(\Gamma A)^{1/d} \quad \text{in } D(0,\infty) \text{ as } \epsilon \to 0,
\]
which is equivalent to what is to be proved.

For part (ii), apply the continuous mapping theorem, Theorem 5.1 of Billingsley (1968),
with the inverse mapping; see Section 7 of Whitt (1980). The inverse map there is
$x^{-1}(t) = \inf\{s \ge 0: x(s) > t\}$, $t > 0$, but the results also apply to first passage times of
the form (2.3). Note that
\begin{align*}
\epsilon^{1/\gamma}T_1(\epsilon/t) &= \inf\{s \ge 0: \epsilon^{-1}V(s/\epsilon^{1/\gamma}) \le t^{-1}\} \\
&\Rightarrow \inf\{s > 0: s^{-\gamma}m(\Gamma A)^{1/d} \le t^{-1}\} = m(\Gamma A)^{1/(d\gamma)}t^{1/\gamma},
\end{align*}
so that the same limit holds for $T_2(\epsilon/t)$.

For part (iii), apply the continuous mapping theorem with the composition map, using
Theorem 3.1 of Whitt (1980), with parts (i) and (ii) above. In particular, use
\[
\epsilon^{-1}m(C(t/\epsilon^{1/\gamma}))^{1/d} \Rightarrow t^{-\gamma}m(\Gamma A)^{1/d} \quad \text{in } D(0,\infty)
\]
with (ii) to obtain (iii). Similarly, use the continuous mapping theorem with the
composition map to obtain (iv), now drawing on (2.1) and (ii). To obtain the equivalent
forms of the limit, note that $\mathcal{Y}(ct) \stackrel{d}{=} c^{-\gamma}\mathcal{Y}(t)$, so that $t^{\gamma}\mathcal{Y}(t) \stackrel{d}{=} c^{\gamma}t^{\gamma}\mathcal{Y}(ct)$ for any scalar
$c > 0$. Finally, to obtain part (v), apply the continuous mapping theorem with the projection
map at $t = 1$ to obtain the ordinary CLT in part (iv) of Theorem 1. Then apply the
argument for the proof of (v) in Theorem 1.

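Proof of Theorem 4. It is shown in Glynn and Heidelberger (1989) that
$\max\{\|\bar{X}_{in} - \mu\|: 1 \le i \le n\} \to 0$ w.p.1 as $n \to \infty$. Hence, in expanding $g(\bar{X}_{in})$ in a
Taylor series about $\bar{X}_n$, namely
\[
g(\bar{X}_{in}) = g(\bar{X}_n) + \nabla g(\xi_{in})'(\bar{X}_{in} - \bar{X}_n),
\]
we may assert that $\max_{1 \le i \le n}\|\xi_{in} - \mu\| \to 0$ w.p.1 as $n \to \infty$. Then,
\begin{align*}
\tilde{\alpha}_i(n) - Y_n &= [n\alpha(n) - (n-1)\alpha_i(n)] - n^{-1}\sum_{j=1}^{n}[n\alpha(n) - (n-1)\alpha_j(n)] \\
&= (n-1)n^{-1}\sum_{j=1}^{n}\alpha_j(n) - (n-1)\alpha_i(n) \\
&= (n-1)n^{-1}\sum_{j=1}^{n}[g(\bar{X}_n) + \nabla g(\xi_{jn})'(\bar{X}_{jn} - \bar{X}_n)] - (n-1)[g(\bar{X}_n) + \nabla g(\xi_{in})'(\bar{X}_{in} - \bar{X}_n)] \\
&= (n-1)n^{-1}\sum_{j=1}^{n}\nabla g(\xi_{jn})'(\bar{X}_{jn} - \bar{X}_n) - (n-1)\nabla g(\xi_{in})'(\bar{X}_{in} - \bar{X}_n).
\end{align*}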

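However,
\[
(n-1)(\bar{X}_{jn} - \bar{X}_n) = (n-1)\bar{X}_{jn} - n\bar{X}_n + \bar{X}_n = \sum_{k=1,\,k \ne j}^{n} X_k - \sum_{k=1}^{n} X_k + \bar{X}_n = \bar{X}_n - X_j,
\]
so that
\[
\tilde{\alpha}_i(n) - Y_n = n^{-1}\sum_{j=1}^{n}\nabla g(\xi_{jn})'(\bar{X}_n - X_j) - \nabla g(\xi_{in})'(\bar{X}_n - X_i).
\tag{5.5}
\]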

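The first term on the right-hand side of (5.5) may be written as
\begin{align*}
& n^{-1}\nabla g(\mu)'\sum_{j=1}^{n}(\bar{X}_n - X_j) + n^{-1}\sum_{j=1}^{n}(\nabla g(\xi_{jn})' - \nabla g(\mu)')(\bar{X}_n - X_j) \\
&\quad = n^{-1}\sum_{j=1}^{n}(\nabla g(\xi_{jn}) - \nabla g(\mu))'(\bar{X}_n - X_j) \equiv \beta_n.
\end{align*}
Let $M_n = \max_{1 \le j \le n}\|\nabla g(\xi_{jn}) - \nabla g(\mu)\|$. Then,
\[
|\beta_n| \le M_n n^{-1}\sum_{j=1}^{n}(\|\bar{X}_n\| + \|X_j\|) \to 0 \cdot (\|\mu\| + E\|X\|) = 0 \quad \text{w.p.1 as } n \to \infty.
\]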

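Thus,
\[
\tilde{\alpha}_i(n) - Y_n = \nabla g(\mu)'(X_i - \bar{X}_n) + [\nabla g(\xi_{in}) - \nabla g(\mu)]'(X_i - \bar{X}_n) + \beta_n.
\]
Let $\beta_{in} = \nabla g(\xi_{in}) - \nabla g(\mu)$ and $W_{in} = X_i - \bar{X}_n$. Then,
\begin{align*}
\sigma^2(n) &= \nabla g(\mu)'\left[\frac{1}{n}\sum_{i=1}^{n}W_{in}W_{in}'\right]\nabla g(\mu) + \frac{1}{n}\sum_{i=1}^{n}\beta_{in}'W_{in}W_{in}'\beta_{in} \tag{5.6} \\
&\quad + \beta_n^2 + 2\beta_n\,\frac{1}{n}\sum_{i=1}^{n}\beta_{in}'W_{in} + 2\nabla g(\mu)'\,\frac{1}{n}\sum_{i=1}^{n}W_{in}W_{in}'\beta_{in}.
\end{align*}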

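Also,
\begin{align*}
\frac{1}{n}\sum_{i=1}^{n}\|W_{in}W_{in}'\| &\le \frac{1}{n}\sum_{i=1}^{n}\bigl[\|X_iX_i'\| + \|\bar{X}_nX_i'\| + \|X_i\bar{X}_n'\| + \|\bar{X}_n\bar{X}_n'\|\bigr] \tag{5.7} \\
&\to E\bigl[\|XX'\| + \|\mu X'\| + \|X\mu'\| + \|\mu\mu'\|\bigr] \quad \text{w.p.1 as } n \to \infty
\end{align*}
and
\[
\frac{1}{n}\sum_{i=1}^{n}\|W_{in}\| \le \frac{1}{n}\sum_{i=1}^{n}\bigl[\|X_i\| + \|\bar{X}_n\|\bigr] \to E\|X\| + \|\mu\| \quad \text{w.p.1 as } n \to \infty.
\tag{5.8}
\]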

Since $M_n \to 0$ w.p.1 as $n \to \infty$, it follows that $\|\beta_{in}\| \to 0$ uniformly in $i$ w.p.1. Hence, it is
evident from (5.7) and (5.8) that all the terms in (5.6) involving $\beta_{in}$ and $\beta_n$ converge to
zero w.p.1, but the first term on the right-hand side of (5.6) clearly converges to
$\sigma^2 = \nabla g(\mu)'C\nabla g(\mu)$, completing the proof.
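
The limit in this proof can be checked numerically in a scalar toy case. The Python sketch below forms jackknife pseudovalues n g(mean) - (n-1) g(leave-one-out mean), averages them, and computes their sample variance with a 1/n normalization (an assumption of the sketch). With g(x) = x^2 and normal data, the target value is (2 mu)^2 times the variance of the observations. The data distribution, the seed, and the name `jackknife_variance` are hypothetical choices for illustration.

```python
import random


def jackknife_variance(xs, g):
    """Return (jackknife point estimate, variance of the pseudovalues)."""
    n = len(xs)
    total = sum(xs)
    g_full = g(total / n)
    # Pseudovalue i: n * g(full mean) - (n - 1) * g(mean with x_i left out).
    pseudo = [n * g_full - (n - 1) * g((total - x) / (n - 1)) for x in xs]
    y_n = sum(pseudo) / n
    var_hat = sum((p - y_n) ** 2 for p in pseudo) / n
    return y_n, var_hat


rng = random.Random(3)
mu, sd = 3.0, 2.0
xs = [rng.gauss(mu, sd) for _ in range(20000)]
y_n, var_hat = jackknife_variance(xs, lambda x: x * x)
target = (2.0 * mu) ** 2 * sd ** 2  # gradient of g at mu, squared, times var(X)
print("jackknife estimate:", y_n, " variance estimate:", var_hat, " target:", target)
```

For large n the variance estimate should be close to the target, in line with the convergence established above.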